Integrating the energy information into MFCC

Authors

Fang Zheng

Guoliang Zhang

Abstract

The Mel-Frequency Cepstrum Coefficients (MFCC), introduced in 1980 by Davis and Mermelstein [2], are a widely used feature set in automatic speech recognition systems. In the traditional implementation, the 0th coefficient is excluded because it is considered somewhat unreliable. In this paper, we analyze this term and find that it can be regarded as a generalized frequency band energy (FBE) and is hence useful; including it yields the FBE-MFCC. We also propose a better analysis of the frame energy, called auto-regressive analysis, which performs better than its first- and/or second-order differential derivatives. Experiments show that the FBE-MFCC and the frame energy, together with their corresponding auto-regressive analysis coefficients, form a better combination, reducing the syllable error rate (SER) by 10.0% across a very large speech database compared with the traditional MFCC with its corresponding auto-regressive analysis coefficients.
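
The link between the 0th coefficient and the band energies can be illustrated with a minimal sketch. It assumes the common unnormalized DCT-II form of the cepstrum over log filter-bank energies; the paper's exact definition of FBE-MFCC may differ. The function name `dct_cepstrum` and the band-energy values are hypothetical and chosen only for illustration.

```python
import math


def dct_cepstrum(log_fbe, num_coeffs):
    """Unnormalized DCT-II of log filter-bank energies (assumed formulation).

    c[n] = sum over bands m = 0..M-1 of log_fbe[m] * cos(pi * n * (m + 0.5) / M)
    """
    M = len(log_fbe)
    return [
        sum(e * math.cos(math.pi * n * (m + 0.5) / M) for m, e in enumerate(log_fbe))
        for n in range(num_coeffs)
    ]


# Hypothetical filter-bank energies for one frame (illustrative values only).
band_energies = [4.0, 2.5, 1.2, 0.8, 0.5, 0.3]
log_fbe = [math.log(e) for e in band_energies]

cepstrum = dct_cepstrum(log_fbe, num_coeffs=4)

# For n = 0 every cosine term equals 1, so c[0] is the sum of the log band
# energies, i.e. the log of the product of the band energies.
assert math.isclose(cepstrum[0], sum(log_fbe))
```

Under this assumption, discarding the 0th coefficient discards an energy-like summary of the whole frame, which is consistent with the abstract's argument that the term carries useful information.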

Similar resources

Integrating information of the efficient and anti-efficient frontiers in DEA analysis to assess location of solar plants: A case study in Iran

Solar photovoltaic (PV) energy is one of the most promising sources of energy and has attracted considerable interest. It is potentially the largest source of energy in the world and is capable of mitigating greenhouse gas (GHG) emissions significantly in comparison with fossil fuels. Location optimization of solar plants can play a vital role in raising the efficiency and performance of the solar PV s...

Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification

This paper describes a speaker identification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. Conventional speaker recognition systems typically adopt the cepstral coefficients, e.g., Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC), as the representative features. The cepstral fea...

High Improvement of Speaker Identification and Verification by Combining Mfcc and Phase Information

In conventional speaker recognition methods based on MFCC, phase information has been ignored. We previously proposed a method that integrated the phase information with MFCC for speaker identification, and a preliminary experiment was performed. In this paper, we propose a new modified feature parameter (that is, coordinates on a unit circle) obtained from the original phase information, and eva...

Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features

Gabor features have been proposed for extracting spectro-temporal modulation information, and yielding significant improvements in recognition performance. In this paper, we propose the integration of Gabor posteriors with MFCC posteriors, yielding a relative improvement of 14.3% over an MFCC Tandem system. We analyze for different types of acoustic units the complementarity between Gabor featu...

Integrating Complementary Features with a Confidence Measure for Speaker Identification

This paper investigates the effectiveness of integrating complementary acoustic features for improved speaker identification performance. The complementary contributions of two acoustic features, i.e. the conventional vocal tract related features MFCC and the recently proposed vocal source related features WOCOR, for speaker identification are studied. An integrating system, which performs a sc...

Publication date: 2000
